A QUENCHED LARGE DEVIATION PRINCIPLE AND A PARISI 
FORMULA FOR A PERCEPTRON VERSION OF THE GREM. 



ERWIN BOLTHAUSEN AND NICOLA KISTLER 



Abstract. We introduce a perceptron version of the Generalized Random Energy Model, and 
prove a quenched Sanov type large deviation principle for the empirical distribution of the 
random energies. The dual of the rate function has a representation through a variational 
formula which is closely related to the Parisi variational formula for the SK-model. 



Dedicated to Jürgen Gärtner on the occasion of his 60th birthday. 
1. Introduction 

There has been important progress in the mathematical study of mean field spin glasses over the last 10 years. By results of Guerra [10] and Talagrand [13], the free energy of the Sherrington-Kirkpatrick model is known to be given by the formula predicted by Parisi [9]. Furthermore, the description of the high temperature regime is remarkably accurate, see [13] and references therein. On the other hand, results for the Gibbs measure at low temperature are more scarce and are restricted to models with a simpler structure, like Derrida's generalized random energy model, the GREM, [5] and [8], the nonhierarchical GREMs [2] and the p-spin model with large p [13]. To put the full Parisi picture on rigorous ground remains a major challenge, and even more so in view of its alleged universality, at least for mean-field models. 

We introduce here a model which hopefully sheds some new light on the issue. 

In this paper we derive the free energy, which can be analyzed by large deviation techniques. The limiting free energy turns out to be given by a Gibbs variational formula which can be linked to a Parisi-type formula by a duality principle, so that it becomes evident why an infimum appears in the latter. This duality also gives an interesting interpretation of the Parisi order parameter in terms of the sequence of inverse temperatures associated to the extremal measures from the Gibbs variational principle. 

In a forthcoming paper, we will give a full description of the Gibbs measure in the thermodynamic limit in terms of the Ruelle cascades. 

2. A Perceptron version of the GREM 

Let $\{X_{\alpha,i}\}_{\alpha\in\Sigma_N,\,1\le i\le N}$ be random variables which take values in a Polish space $S$ equipped with the Borel $\sigma$-field $\mathcal{S}$, and defined on a probability space $(\Omega,\mathcal{F},\mathbb{P})$. We write $\mathcal{M}_1^+(S)$ for the set of probability measures on $(S,\mathcal{S})$, which itself is a Polish space. $\Sigma_N$ is exponential in size, typically $|\Sigma_N| = 2^N$. It is assumed that all $X_{\alpha,i}$ have the same distribution $\mu$, and that for any fixed $\alpha\in\Sigma_N$, the collection $\{X_{\alpha,i}\}_{1\le i\le N}$ is independent. It is however not assumed that they are independent for different $\alpha$. The perceptron Hamiltonian is defined by 

$$-H_{N,\omega}(\alpha) \overset{\mathrm{def}}{=} \sum_{i=1}^{N} \phi\left(X_{\alpha,i}(\omega)\right), \tag{2.1}$$

where $\phi : S \to \mathbb{R}$ is a measurable function. One may allow that the index set for $i$ is rather $\{1,\dots,[aN]\}$ with $a$ some positive real number, but for convenience, we always stick to $a = 1$ here. The case which is best investigated (see [13]) takes for $\alpha$ spin sequences: $\alpha = (\alpha_1,\dots,\alpha_N)\in\{-1,1\}^N$, $S=\mathbb{R}$, and the $X_{\alpha,i}$ are centered Gaussians with 

$$\mathbb{E}\left[X_{\alpha,i}X_{\alpha',i'}\right] = \delta_{i,i'}\,\frac{1}{N}\sum_{j=1}^{N}\alpha_j\alpha'_j. \tag{2.2}$$

This is closely related to the SK-model, and is actually considerably more difficult. The model has been investigated by Talagrand [13], but a full Parisi formula for the free energy is lacking. The Hamiltonian (2.1) can be written in terms of the empirical measure 

$$L_{N,\alpha} \overset{\mathrm{def}}{=} \frac{1}{N}\sum_{i=1}^{N}\delta_{X_{\alpha,i}}, \tag{2.3}$$

i.e. 

$$-H_{N,\omega}(\alpha) = N\int\phi(x)\,L_{N,\alpha}(dx).$$

The quenched free energy is the almost sure limit of 

$$\frac{1}{N}\log\sum_{\alpha}\exp\left[-H_{N,\omega}(\alpha)\right],$$

and it appears natural to ask if this free energy can be obtained by a quenched Sanov type large deviation principle for $L_{N,\alpha}$ in the following form: 

Definition 2.1. We say that $\{L_{N,\alpha}\}$ satisfies a quenched large deviation principle (in short QLDP) with good rate function $J : \mathcal{M}_1^+(S)\to[0,\infty]$, provided the level sets of $J$ are compact, and for any weakly continuous bounded map $\Phi : \mathcal{M}_1^+(S)\to\mathbb{R}$, one has 

$$\lim_{N\to\infty}\frac{1}{N}\log\sum_{\alpha\in\Sigma_N}\exp\left[N\Phi(L_{N,\alpha})\right] = \log 2+\sup_{\nu\in\mathcal{M}_1^+(S)}\left[\Phi(\nu)-J(\nu)\right],\quad \mathbb{P}\text{-a.s.}$$

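The annealed version of such a QLDP is just Sanov's theorem: 

$$\lim_{N\to\infty}\frac{1}{N}\log\sum_{\alpha}\mathbb{E}\exp\left[N\Phi(L_{N,\alpha})\right] = \log 2+\lim_{N\to\infty}\frac{1}{N}\log\mathbb{E}\exp\left[N\Phi(L_{N,\alpha})\right] = \log 2+\sup_{\nu}\left(\Phi(\nu)-H(\nu\mid\mu)\right),$$

where $H(\nu\mid\mu)$ is the usual relative entropy of $\nu$ with respect to $\mu$, the latter being the distribution of the $X_{\alpha,i}$: 

$$H(\nu\mid\mu) \overset{\mathrm{def}}{=} \begin{cases}\int\log\frac{d\nu}{d\mu}\,d\nu & \text{if } \nu\ll\mu,\\ \infty & \text{otherwise.}\end{cases}$$

There is no reason to believe that $H(\nu\mid\mu) = J(\nu)$. 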

Conjecture 2.2. The empirical measures $\{L_{N,\alpha}\}$ with (2.2) satisfy a QLDP. 

We don't know how this conjecture could be proved, nor do we have a clear picture of what $J$ should be in this case. The only support we have for the conjecture is that it is true in a perceptron version of the GREM, a model we are now going to describe. 

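For $n\in\mathbb{N}$, $\alpha = (\alpha_1,\dots,\alpha_n)$ with $1\le\alpha_k\le 2^{\gamma_k N}$, $\sum_k\gamma_k = 1$, and $1\le i\le N$, let 

$$X_{\alpha,i} = \left(X^1_{\alpha_1,i},\,X^2_{\alpha_1,\alpha_2,i},\,\dots,\,X^n_{\alpha_1,\alpha_2,\dots,\alpha_n,i}\right),$$

where the $X^j$ are independent, taking values in some Polish space $(S,\mathcal{S})$ with distribution $\mu_j$. For notational convenience, we assume that the $\gamma_j N$ are all integers. Put 

$$\Gamma_j \overset{\mathrm{def}}{=} \sum_{k=1}^{j}\gamma_k.$$

We assume that all the variables in the bracket are independent. The $X_{\alpha,i}$ take values in $S^n$. The distribution is 

$$\mu \overset{\mathrm{def}}{=} \mu_1\otimes\cdots\otimes\mu_n.$$

The empirical measure $L_{N,\alpha}$ is defined by (2.3), which is a random element in $\mathcal{M}_1^+(S^n)$. $n$ is fixed in all we are doing. 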

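Given a measure $\nu\in\mathcal{M}_1^+(S^n)$, and $1\le j\le n$, we write $\nu^{(j)}$ for its marginal on the first $j$ coordinates. We define subsets $\mathcal{R}_j$ of $\mathcal{M}_1^+(S^n)$, $1\le j\le n$, by 

$$\mathcal{R}_j \overset{\mathrm{def}}{=} \left\{\nu\in\mathcal{M}_1^+(S^n) : H\left(\nu^{(j)}\mid\mu^{(j)}\right)\le\Gamma_j\log 2\right\}.$$

We will also consider the sets 

$$\mathcal{R}_j^{=} \overset{\mathrm{def}}{=} \left\{\nu\in\mathcal{M}_1^+(S^n) : H\left(\nu^{(j)}\mid\mu^{(j)}\right)=\Gamma_j\log 2\right\}.$$

For $\nu\in\mathcal{M}_1^+(S^n)$ let 

$$J(\nu) \overset{\mathrm{def}}{=} \begin{cases}H(\nu\mid\mu) & \text{if } \nu\in\bigcap_{j=1}^{n}\mathcal{R}_j,\\ \infty & \text{otherwise.}\end{cases}$$

It is evident that $J$ is convex and has compact level sets. 
Our first main result is: 

Theorem 2.3. $\{L_{N,\alpha}\}$ satisfies a QLDP with rate function $J$. 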

For the rest of this section, we will focus on linear functionals, $\Phi(\nu) = \int\phi(x)\,\nu(dx)$, for a bounded continuous function $\phi : S^n\to\mathbb{R}$. For a probability measure $\nu$ on $S^n$, we set 

$$\mathrm{Gibbs}(\phi,\nu) \overset{\mathrm{def}}{=} \int\phi(x)\,\nu(dx) - H(\nu\mid\mu),$$

and define the Legendre transform of $J$ by 

$$J^*(\phi) \overset{\mathrm{def}}{=} \sup_{\nu}\left\{\int\phi(x)\,\nu(dx) - J(\nu)\right\} = \sup\left\{\mathrm{Gibbs}(\phi,\nu) : \nu\in\bigcap_{j=1}^{n}\mathcal{R}_j\right\},$$

whenever the a.s.-limit exists. As a corollary of Theorem 2.3 we have 
Corollary 2.4. Assume that $\phi : S^n\to\mathbb{R}$ is bounded and continuous. Then 

$$\lim_{N\to\infty}\frac{1}{N}\log\sum_{\alpha}\exp\left[\sum_{i=1}^{N}\phi(X_{\alpha,i})\right] = J^*(\phi)+\log 2,\quad\text{a.s.}$$

We next discuss a dual representation of $J^*(\phi)$. Essentially, this comes up by investigating which measures solve the variational problem. Remark that without the restrictions $\nu\in\bigcap_{j=1}^{n}\mathcal{R}_j$, we would simply get 

$$\frac{d\nu}{d\mu} = \frac{e^{\phi}}{\int e^{\phi}\,d\mu}$$

as the maximizer. 

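Let $\Delta$ be the set of sequences $m = (m_1,\dots,m_n)$ with $0<m_1\le m_2\le\cdots\le m_n\le 1$. For $m\in\Delta$, and $\phi : S^n\to\mathbb{R}$ bounded, we define recursively functions $\phi_j$, $0\le j\le n$, $\phi_j : S^j\to\mathbb{R}$, by 

$$\phi_n \overset{\mathrm{def}}{=} \phi, \tag{2.4}$$

$$\phi_{j-1}(x_1,\dots,x_{j-1}) \overset{\mathrm{def}}{=} \frac{1}{m_j}\log\int\exp\left[m_j\phi_j(x_1,\dots,x_{j-1},x_j)\right]\mu_j(dx_j). \tag{2.5}$$

$\phi_0$ is just a real number, which we denote by $\phi_0(m)$. 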
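The recursion (2.4), (2.5) can be evaluated directly when $S$ is a finite set. The following is a minimal sketch under that assumption; the function name `phi_recursion` and the dictionary-based encoding of $\phi$ and of the $\mu_j$ are illustrative choices and not part of the paper.

```python
import math

def phi_recursion(phi, mus, m):
    """Return phi_0(m) via the recursion (2.4)-(2.5) on finite spaces.

    phi : dict mapping n-tuples (x_1, ..., x_n) to real numbers (phi_n = phi)
    mus : list of n dicts; mus[j-1] maps a point of S to its mu_j-mass
    m   : list of n positive numbers m_1, ..., m_n
    """
    n = len(mus)
    current = dict(phi)                       # start with phi_n = phi
    for j in range(n, 0, -1):                 # build phi_{j-1} from phi_j
        mj, mu_j = m[j - 1], mus[j - 1]
        prefixes = {x[:-1] for x in current}  # the points (x_1, ..., x_{j-1})
        current = {
            p: math.log(sum(math.exp(mj * current[p + (xj,)]) * w
                            for xj, w in mu_j.items())) / mj
            for p in prefixes
        }
    return current[()]                        # phi_0 is just a real number

# Hypothetical two-level example: S = {0, 1}, both mu_j uniform.
mus = [{0: 0.5, 1: 0.5}, {0: 0.5, 1: 0.5}]
phi = {(x1, x2): x1 + 2.0 * x2 for x1 in (0, 1) for x2 in (0, 1)}

phi0_rs = phi_recursion(phi, mus, [1.0, 1.0])   # all m_j equal to 1
phi0_low = phi_recursion(phi, mus, [0.5, 0.5])  # a smaller common value
direct = math.log(sum(math.exp(v) * 0.25 for v in phi.values()))
```

With all $m_j = 1$ the recursion collapses to $\log\int e^{\phi}\,d\mu$, which is what `direct` computes for comparison.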

Remark that if some of the $m_i$ agree, say $m_k = m_{k+1} = \cdots = m_l$, $k<l$, then $\phi_{k-1}$ is obtained from $\phi_l$ by 

$$\phi_{k-1}(x_1,\dots,x_{k-1}) = \frac{1}{m_k}\log\int\exp\left[m_k\phi_l(x_1,\dots,x_{k-1},x_k,\dots,x_l)\right]\prod_{j=k}^{l}\mu_j(dx_j).$$

In particular, if all the $m_i$ are 1, then 

$$\phi_0 = \log\int\exp[\phi]\,d\mu.$$

This latter case corresponds to the "replica symmetric" situation. Put 

$$\mathrm{Parisi}(m,\phi) \overset{\mathrm{def}}{=} \sum_{i=1}^{n}\frac{\gamma_i\log 2}{m_i}+\phi_0(m)-\log 2. \tag{2.6}$$

Theorem 2.5. Assume that $\phi : S^n\to\mathbb{R}$ is bounded and continuous. Then 

$$J^*(\phi) = \inf_{m\in\Delta}\mathrm{Parisi}(m,\phi). \tag{2.7}$$

The expression for $J^*(\phi)$ in this theorem is very similar to the Parisi formula for the SK-model. Essentially the only difference is the first summand which in the SK-case is a quadratic expression. In our case (in contrast to the still open situation in the SK-model), we can prove that the infimum is uniquely attained, as we will discuss below. 
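As a numerical illustration of (2.7), one can look at the simplest case $n=1$ (so $\gamma_1 = 1$) on a finite set. The sketch below is only an illustration under assumptions that are not in the paper: $S$ has eight points, $\mu$ is uniform, and $\phi$ is a hypothetical linear function chosen so that the entropy constraint is active. It compares a grid minimum of $\mathrm{Parisi}(m,\phi)$ with the Gibbs functional at the tilted measure $G_m\propto e^{m\phi}\mu$ whose relative entropy equals $\log 2$.

```python
import math

def tilt(phi, mu, m):
    """Tilted measure G_m(x) proportional to exp(m * phi(x)) * mu(x)."""
    w = [math.exp(m * f) * p for f, p in zip(phi, mu)]
    z = sum(w)
    return [v / z for v in w]

def rel_entropy(nu, mu):
    """Relative entropy H(nu | mu) on a finite set."""
    return sum(q * math.log(q / p) for q, p in zip(nu, mu) if q > 0)

def parisi(m, phi, mu):
    """Parisi(m, phi) from (2.6) for n = 1, i.e. gamma_1 = 1."""
    phi0 = math.log(sum(math.exp(m * f) * p for f, p in zip(phi, mu))) / m
    return math.log(2) / m + phi0 - math.log(2)

def gibbs(phi, nu, mu):
    """Gibbs(phi, nu) = integral of phi d(nu) minus H(nu | mu)."""
    return sum(f * q for f, q in zip(phi, nu)) - rel_entropy(nu, mu)

mu = [1 / 8] * 8                      # assumed: uniform reference measure
phi = [6 * x / 7 for x in range(8)]   # assumed: hypothetical test function

# Bisection for the m with H(G_m | mu) = log 2; H(G_m | mu) increases with m.
lo, hi = 1e-9, 1.0
for _ in range(200):
    mid = (lo + hi) / 2
    if rel_entropy(tilt(phi, mu, mid), mu) < math.log(2):
        lo = mid
    else:
        hi = mid
m_star = (lo + hi) / 2

parisi_min = min(parisi(k / 20000, phi, mu) for k in range(1, 20001))
gibbs_star = gibbs(phi, tilt(phi, mu, m_star), mu)
gibbs_free = gibbs(phi, tilt(phi, mu, 1.0), mu)   # unconstrained maximum
```

In this example the minimizing $m$ lies strictly inside $(0,1)$, and the constrained value `gibbs_star` is strictly smaller than the unconstrained value `gibbs_free`.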

The derivation of the theorem from Corollary 2.4 is done by identifying first the possible maximizers in the variational formula for $J^*(\phi)$. They belong to a family of distributions, parametrized by $m$. The maximizer inside this family is then obtained by minimizing over $m$ according to (2.7), and one then identifies the two expressions. The procedure is quite standard in large deviation situations. 

Two conventions: $C$ stands for a generic positive constant, not necessarily the same at different occurrences. If there are inequalities stated between expressions containing $N$, it is tacitly assumed that they are valid maybe only for large enough $N$. 

3. Proofs 

3.1. The Gibbs variational principle: Proof of Theorem 2.3. If $A\subset\mathcal{M}_1^+(S)$, we put $H(A\mid\mu) \overset{\mathrm{def}}{=} \inf_{\nu\in A}H(\nu\mid\mu)$. If $S$ is a Polish space, and $\mathcal{S}$ its Borel $\sigma$-field, then it is well known that $\nu\mapsto H(\nu\mid\mu)$ is lower semicontinuous in the weak topology. This follows from the representation 

$$H(\nu\mid\mu) = \sup_{u\in\mathcal{U}}\left[\int u\,d\nu-\log\int e^{u}\,d\mu\right], \tag{3.1}$$

where $\mathcal{U}$ is the set of bounded continuous functions $S\to\mathbb{R}$. 
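On a finite set the representation (3.1) is easy to test numerically: the supremum is attained at $u = \log(d\nu/d\mu)$, and every other $u$ gives a value that is no larger. The sketch below assumes a three-point space and two hypothetical measures; the helper names are illustrative.

```python
import math
import random

def rel_entropy(nu, mu):
    """Relative entropy H(nu | mu) on a finite set."""
    return sum(q * math.log(q / p) for q, p in zip(nu, mu) if q > 0)

def dv_functional(u, nu, mu):
    """The bracket in (3.1): integral of u d(nu) minus log integral of e^u d(mu)."""
    return (sum(ui * q for ui, q in zip(u, nu))
            - math.log(sum(math.exp(ui) * p for ui, p in zip(u, mu))))

mu = [0.5, 0.3, 0.2]   # assumed reference measure
nu = [0.2, 0.2, 0.6]   # assumed second measure, equivalent to mu

h = rel_entropy(nu, mu)
u_opt = [math.log(q / p) for q, p in zip(nu, mu)]   # the maximizing u

random.seed(0)
trials = [dv_functional([random.uniform(-3, 3) for _ in mu], nu, mu)
          for _ in range(1000)]
```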

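For $(S,\mathcal{S})$, $(S',\mathcal{S}')$ two Polish spaces, and $\nu\in\mathcal{M}_1^+(S\times S')$, $\mu\in\mathcal{M}_1^+(S)$, $\mu'\in\mathcal{M}_1^+(S')$, we have 

$$H(\nu\mid\mu\otimes\mu') = H\left(\nu^{(1)}\mid\mu\right)+H\left(\nu\mid\nu^{(1)}\otimes\mu'\right), \tag{3.2}$$

where $\nu^{(1)}$ is the first marginal of $\nu$ on $S$. 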
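The chain rule (3.2) can be checked by hand on a small product space. The following sketch does this for a hypothetical joint law $\nu$ on $\{0,1\}\times\{0,1\}$; all numerical values are arbitrary assumptions chosen for illustration.

```python
import math

def rel_entropy(nu, mu):
    """Relative entropy H(nu | mu) for dicts over the same finite set."""
    return sum(q * math.log(q / mu[x]) for x, q in nu.items() if q > 0)

# Assumed example data: a joint law nu and two reference measures mu, mu'.
nu = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.3, (1, 1): 0.2}
mu = {0: 0.5, 1: 0.5}
mu_prime = {0: 0.25, 1: 0.75}

# First marginal nu^(1) of nu on S.
nu1 = {x: sum(q for (a, _), q in nu.items() if a == x) for x in mu}

# The two product measures appearing in (3.2).
mu_times_mu_prime = {(x, y): mu[x] * mu_prime[y] for x in mu for y in mu_prime}
nu1_times_mu_prime = {(x, y): nu1[x] * mu_prime[y] for x in mu for y in mu_prime}

lhs = rel_entropy(nu, mu_times_mu_prime)
rhs = rel_entropy(nu1, mu) + rel_entropy(nu, nu1_times_mu_prime)
```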

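Lemma 3.1. $H\left(\nu\mid\nu^{(1)}\otimes\mu'\right)$ is a lower semicontinuous function of $\nu$ in the weak topology. 

Proof. Applying (3.1) to $H\left(\nu\mid\nu^{(1)}\otimes\mu'\right)$, we have 

$$H\left(\nu\mid\nu^{(1)}\otimes\mu'\right) = \sup_{u\in\mathcal{U}}\left[\int u\,d\nu-\log\int e^{u}\,d\left(\nu^{(1)}\otimes\mu'\right)\right],$$

where $\mathcal{U}$ now denotes the set of bounded continuous functions $S\times S'\to\mathbb{R}$. For any fixed $u\in\mathcal{U}$, both functions $\nu\mapsto\int u\,d\nu$ and $\nu\mapsto\log\int e^{u}\,d\left(\nu^{(1)}\otimes\mu'\right)$ are continuous, and from this the desired semicontinuity property follows. □ 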

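We will need the following "relative" version of Sanov's theorem. Consider three independent sequences of i.i.d. random variables $(X_i)$, $(Y_i)$, $(Z_i)$, taking values in three Polish spaces $S$, $S'$, $S''$, and with laws $\mu$, $\mu'$, $\mu''$. We consider the empirical processes 

$$L_N \overset{\mathrm{def}}{=} \frac{1}{N}\sum_{i=1}^{N}\delta_{(X_i,Y_i)},\qquad R_N \overset{\mathrm{def}}{=} \frac{1}{N}\sum_{i=1}^{N}\delta_{(X_i,Z_i)}.$$

The pair $(L_N,R_N)$ takes values in $\mathcal{M}_1^+(S\times S')\times\mathcal{M}_1^+(S\times S'')$. 

Lemma 3.2. The sequence $(L_N,R_N)$ satisfies a LDP with rate function 

$$J(\nu,\theta) = \begin{cases}H\left(\nu^{(1)}\mid\mu\right)+H\left(\nu\mid\nu^{(1)}\otimes\mu'\right)+H\left(\theta\mid\theta^{(1)}\otimes\mu''\right) & \text{if } \nu^{(1)}=\theta^{(1)},\\ \infty & \text{otherwise.}\end{cases}$$

Proof. We apply the Sanov theorem to the empirical measure 

$$M_N \overset{\mathrm{def}}{=} \frac{1}{N}\sum_{i=1}^{N}\delta_{(X_i,Y_i,Z_i)}.$$

We use the two natural projections $p : S\times S'\times S''\to S\times S'$ and $q : S\times S'\times S''\to S\times S''$. Then $(L_N,R_N) = M_N(p,q)^{-1}$, and by continuous projection, we get that $(L_N,R_N)$ satisfies a good LDP with rate function 

$$J'(\nu,\theta) = \inf\left\{H(\rho\mid\mu\otimes\mu'\otimes\mu'') : \rho p^{-1}=\nu,\ \rho q^{-1}=\theta\right\}.$$

It only remains to identify this rate function with the function $J$ given above. 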

Clearly $J'(\nu,\theta)=\infty$ if $\nu^{(1)}\ne\theta^{(1)}$. Therefore, assume $\nu^{(1)}=\theta^{(1)}$. If we define $\rho(\nu,\theta)\in\mathcal{M}_1^+(S\times S'\times S'')$ to have marginal $\nu^{(1)}=\theta^{(1)}$ on $S$, and the conditional distribution on $S'\times S''$ given the first projection is the product of the conditional distributions of $\nu$ and $\theta$, then applying (3.2) twice, we get 

$$H(\rho(\nu,\theta)\mid\mu\otimes\mu'\otimes\mu'') = H\left(\nu^{(1)}\mid\mu\right)+H\left(\nu\mid\nu^{(1)}\otimes\mu'\right)+H\left(\theta\mid\theta^{(1)}\otimes\mu''\right),$$

and therefore $J\ge J'$. 

To prove the other inequality, consider any $\rho$ satisfying $\rho p^{-1}=\nu$, $\rho q^{-1}=\theta$. We want to show that $J(\nu,\theta)\le H(\rho\mid\mu\otimes\mu'\otimes\mu'')$. For that, we can assume that the right hand side is finite. Then 

$$H(\rho\mid\mu\otimes\mu'\otimes\mu'') = H(\rho\mid\rho(\nu,\theta))+\int d\rho\,\log\frac{d\rho(\nu,\theta)}{d(\mu\otimes\mu'\otimes\mu'')}.$$

The first summand is $\ge 0$, and the second equals 

$$\int d\rho(\nu,\theta)\,\log\frac{d\rho(\nu,\theta)}{d(\mu\otimes\mu'\otimes\mu'')} = J(\nu,\theta).$$

So, we have proved that 

$$J(\nu,\theta)\le H(\rho\mid\mu\otimes\mu'\otimes\mu'')$$

for any $\rho$ satisfying $\rho p^{-1}=\nu$, $\rho q^{-1}=\theta$. □ 

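We now step back to the setting of Theorem 2.3. For $j = 1,\dots,n$, we have sequences $\left\{X^j_{\alpha_1,\dots,\alpha_j,i}\right\}$ of independent random variables with distribution $\mu_j$. We emphasize that henceforth $\mu = \mu_1\otimes\cdots\otimes\mu_n$ and $\mu^{(k)}$ will denote the marginal on the first $k$ components. Moreover, for $\alpha = (\alpha_1,\dots,\alpha_n)$, we write $\alpha^{(j)} = (\alpha_1,\dots,\alpha_j)$ and set 

$$L_{N,\alpha^{(j)}} \overset{\mathrm{def}}{=} \frac{1}{N}\sum_{i=1}^{N}\delta_{\left(X^1_{\alpha_1,i},\,X^2_{\alpha_1,\alpha_2,i},\,\dots,\,X^j_{\alpha_1,\dots,\alpha_j,i}\right)}$$

for $j\le n$, which is the marginal of $L_{N,\alpha}$ on $S^j$. With the notation 

$$\hat X^{(j)}_{\alpha,i} \overset{\mathrm{def}}{=} \left(X^{j+1}_{\alpha_1,\dots,\alpha_{j+1},i},\,\dots,\,X^{n}_{\alpha_1,\dots,\alpha_n,i}\right)$$

we can write 

$$L_{N,\alpha} = \frac{1}{N}\sum_{i=1}^{N}\delta_{\left(X^1_{\alpha_1,i},\,\dots,\,X^j_{\alpha_1,\dots,\alpha_j,i},\,\hat X^{(j)}_{\alpha,i}\right)}. \tag{3.3}$$

For $A\subset\mathcal{M}_1^+(S^n)$ we put $M_N(A) \overset{\mathrm{def}}{=} \#\{\alpha : L_{N,\alpha}\in A\}$. 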
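To get a feeling for the counting variable $M_N(A)$, it may help to simulate it in a toy case. The sketch below assumes $n=1$, $S=\{0,1\}$, $\mu$ the fair coin law, $N=12$, and $A$ the (hypothetical) set of measures giving mass at least $3/4$ to the point 1. It compares one sample of $M_N(A)$ with the exact first moment and with the exponential bound $2^N e^{-N H(A\mid\mu)}$.

```python
import math
import random

N = 12                     # assumed system size; there are 2**N indices alpha
THRESHOLD = 9              # empirical mass of {1} at least 9/12 = 3/4

def in_A(ones):
    """Does an empirical measure with `ones` ones out of N belong to A?"""
    return ones >= THRESHOLD

# Exact first moment: E M_N(A) = 2^N * P(Binomial(N, 1/2) >= 9).
expected = sum(math.comb(N, k) for k in range(THRESHOLD, N + 1))

# H(A | mu): relative entropy of Bernoulli(3/4) with respect to Bernoulli(1/2).
h_A = 0.75 * math.log(0.75 / 0.5) + 0.25 * math.log(0.25 / 0.5)
sanov_bound = 2 ** N * math.exp(-N * h_A)

# One sample of M_N(A): each alpha carries N independent fair bits.
random.seed(1)
simulated = sum(
    in_A(bin(random.getrandbits(N)).count("1")) for _ in range(2 ** N)
)
```

The first moment is always below the exponential bound here, since the Chernoff estimate for the binomial tail holds for every $N$.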



Lemma 3.3. Assume $\nu\in\mathcal{M}_1^+(S^n)$ satisfies $H(\nu\mid\mu)<\infty$, and let $V$ be an open neighborhood of $\nu$, and $\varepsilon>0$. Then there exists an open neighborhood $U$ of $\nu$, $U\subset V$, and $\delta>0$ such that 

$$\mathbb{P}\left[M_N(U)\ge\exp\left[N\left(\log 2-H(\nu\mid\mu)+\varepsilon\right)\right]\right]\le e^{-\delta N}.$$

Proof. If $B_r(\nu)$ denotes the open $r$-ball around $\nu$ in one of the standard metrics, e.g. the Prohorov metric, then by the semicontinuity property of the relative entropy, one has 

$$H(B_r(\nu)\mid\mu)\uparrow H(\nu\mid\mu)$$

as $r\downarrow 0$. We can choose a sequence $r_k>0$, $r_k\downarrow 0$ with $H(B_{r_k}(\nu)\mid\mu) = H(\mathrm{cl}(B_{r_k}(\nu))\mid\mu)\uparrow H(\nu\mid\mu)$. Given $\varepsilon>0$, and $V$, we can find $k$ such that 

$$H(B_{r_k}(\nu)\mid\mu) = H(\mathrm{cl}(B_{r_k}(\nu))\mid\mu)\ge H(\nu\mid\mu)-\varepsilon/4$$

and $B_{r_k}(\nu)\subset V$. By Sanov's theorem we therefore get 

$$\mathbb{P}\left[L_{N,\alpha}\in B_{r_k}(\nu)\right]\le\exp\left[N\left(-H(\nu\mid\mu)+\varepsilon/2\right)\right],$$

and therefore 

$$\mathbb{E}\,M_N(B_{r_k}(\nu))\le\exp\left[N\left(\log 2-H(\nu\mid\mu)+\varepsilon/2\right)\right].$$

By the Markov inequality, the claim follows by taking $\delta=\varepsilon/3$. □ 



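Lemma 3.4. Assume $\nu\in\mathcal{M}_1^+(S^n)$ satisfies $H\left(\nu^{(j)}\mid\mu^{(j)}\right)>\Gamma_j\log 2$ for some $j\le n$, and let $V$ be an open neighborhood of $\nu$. Then there is an open neighborhood $U$ of $\nu$, $U\subset V$, and $\delta>0$ such that 

$$\mathbb{P}\left[M_N(U)\ne 0\right]\le e^{-\delta N}$$

for large enough $N$. 

Proof. As in the previous lemma, we choose a neighborhood $U'$ of $\nu^{(j)}$ in $\mathcal{M}_1^+(S^j)$ such that $H\left(\mathrm{cl}(U')\mid\mu^{(j)}\right) = H\left(U'\mid\mu^{(j)}\right)>\Gamma_j\log 2+\eta$, for some $\eta>0$. Then we put 

$$U \overset{\mathrm{def}}{=} \left\{\theta\in\mathcal{M}_1^+(S^n) : \theta\in V,\ \theta^{(j)}\in U'\right\}.$$

If $L_{N,\alpha}\in U$ then $L_{N,\alpha^{(j)}}\in U'$. Hence 

$$\begin{aligned}
\mathbb{P}\left[\exists\alpha : L_{N,\alpha}\in U\right] &\le \mathbb{P}\left[\exists\alpha : L_{N,\alpha^{(j)}}\in U'\right] \le 2^{\Gamma_j N}\,\mathbb{P}\left[L_{N,\alpha^{(j)}}\in U'\right]\\
&\le 2^{\Gamma_j N}\exp\left[-N H\left(\mathrm{cl}(U')\mid\mu^{(j)}\right)+N\eta/2\right]\\
&\le 2^{\Gamma_j N}\exp\left[-N\Gamma_j\log 2-N\eta/2\right] = e^{-N\eta/2}.
\end{aligned}$$

This proves the claim. □ 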



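Lemma 3.5. Assume that $\nu\in\mathcal{M}_1^+(S^n)$ satisfies $H\left(\nu^{(j)}\mid\mu^{(j)}\right)<\Gamma_j\log 2$ for all $j$, and let $V$ be an open neighborhood of $\nu$, and $\varepsilon>0$. Then there exists an open neighborhood $U$ of $\nu$, $U\subset V$, and a $\delta>0$ such that 

$$\mathbb{P}\left[M_N(U)\le\exp\left[N\left(\log 2-H(\nu\mid\mu)-\varepsilon\right)\right]\right]\le e^{-\delta N}.$$

Proof. We claim that we can find $U$ as required, and some $\delta>0$, such that 

$$\mathrm{var}\left[M_N(U)\right]\le e^{-2N\delta}\left\{\mathbb{E}\left[M_N(U)\right]\right\}^2. \tag{3.4}$$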
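From this estimate, we easily get the claim: From Sanov's theorem, we have for any $\chi>0$ 

$$\mathbb{E}M_N(U) = 2^N\,\mathbb{P}\left(L_{N,\alpha}\in U\right)\ge\exp\left[N\left(\log 2-H(\nu\mid\mu)-\chi\right)\right]. \tag{3.5}$$

Using this, we get by taking $\chi=\varepsilon/2$ 

$$\begin{aligned}
\mathbb{P}\left(M_N(U)\le e^{N(\log 2-H(\nu\mid\mu)-\varepsilon)}\right)
&= \mathbb{P}\left(M_N(U)-\mathbb{E}M_N(U)\le e^{-N\varepsilon/2}e^{N(\log 2-H(\nu\mid\mu)-\varepsilon/2)}-\mathbb{E}M_N(U)\right)\\
&\le \mathbb{P}\left(M_N(U)-\mathbb{E}M_N(U)\le\left(e^{-N\varepsilon/2}-1\right)\mathbb{E}M_N(U)\right)\\
&\le \mathbb{P}\left(M_N(U)-\mathbb{E}M_N(U)\le-\tfrac12\,\mathbb{E}M_N(U)\right)\\
&\le \mathbb{P}\left(\left|M_N(U)-\mathbb{E}M_N(U)\right|\ge\tfrac12\,\mathbb{E}M_N(U)\right)\\
&\le 4\,\frac{\mathrm{var}\left[M_N(U)\right]}{\left\{\mathbb{E}M_N(U)\right\}^2}\le 4e^{-2N\delta}\le e^{-\delta N}.
\end{aligned}$$

So it remains to prove (3.4). We first claim that for any $j$ 

$$\lim_{r\to 0}\ \inf_{\rho,\theta\in\mathrm{cl}\,B_r(\nu)\,:\,\rho^{(j)}=\theta^{(j)}}\left\{H(\rho\mid\mu)+H\left(\theta\mid\theta^{(j)}\otimes\hat\mu^{(j)}\right)\right\} = H(\nu\mid\mu)+H\left(\nu\mid\nu^{(j)}\otimes\hat\mu^{(j)}\right), \tag{3.6}$$

where $\hat\mu^{(j)} \overset{\mathrm{def}}{=} \mu_{j+1}\otimes\cdots\otimes\mu_n$. The inequality $\le$ is evident by taking $\rho=\theta=\nu$, and the opposite follows from the semicontinuity properties: One gets that for a sequence $(\rho_n,\theta_n)$ with $\rho_n^{(j)}=\theta_n^{(j)}$ and $\rho_n,\theta_n\to\nu$, we have 

$$\liminf_{n\to\infty}H(\rho_n\mid\mu)\ge H(\nu\mid\mu),$$

$$\liminf_{n\to\infty}H\left(\theta_n\mid\theta_n^{(j)}\otimes\hat\mu^{(j)}\right)\ge H\left(\nu\mid\nu^{(j)}\otimes\hat\mu^{(j)}\right),$$

the first inequality by the standard semicontinuity, and the second by Lemma 3.1. This proves (3.6). 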

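Choose $\eta>0$ such that $H\left(\nu^{(j)}\mid\mu^{(j)}\right)<\Gamma_j\log 2-\eta$, for all $1\le j\le n$. By (3.6) we may choose $r$ small enough such that $\mathrm{cl}\,B_r(\nu)\subset V$, and for all $1\le j\le n$, 

$$\begin{aligned}
\inf_{\rho,\theta\in\mathrm{cl}\,B_r(\nu)\,:\,\rho^{(j)}=\theta^{(j)}}\left\{H(\rho\mid\mu)+H\left(\theta\mid\theta^{(j)}\otimes\hat\mu^{(j)}\right)\right\}
&\ge H(\nu\mid\mu)+H\left(\nu\mid\nu^{(j)}\otimes\hat\mu^{(j)}\right)-\eta/2\\
&= 2H(\nu\mid\mu)-H\left(\nu^{(j)}\mid\mu^{(j)}\right)-\eta/2\\
&\ge 2H(\nu\mid\mu)-\Gamma_j\log 2+\eta/2.
\end{aligned}$$

Here the equality uses (3.2). We take $U = B_r(\nu)$. 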

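For two indices $\alpha,\alpha'$ we write $q(\alpha,\alpha') \overset{\mathrm{def}}{=} \max\left\{j : \alpha^{(j)}=\alpha'^{(j)}\right\}$ with $\max\emptyset \overset{\mathrm{def}}{=} 0$. Then 

$$\begin{aligned}
\mathbb{E}M_N^2(U) &= \sum_{j=0}^{n}\ \sum_{\alpha,\alpha':q(\alpha,\alpha')=j}\mathbb{P}\left[L_{N,\alpha}\in U,\ L_{N,\alpha'}\in U\right]\\
&= \sum_{\alpha,\alpha':q(\alpha,\alpha')=0}\mathbb{P}\left[L_{N,\alpha}\in U\right]\mathbb{P}\left[L_{N,\alpha'}\in U\right]+\sum_{j=1}^{n}\ \sum_{\alpha,\alpha':q(\alpha,\alpha')=j}\mathbb{P}\left[L_{N,\alpha}\in U,\ L_{N,\alpha'}\in U\right]\\
&\le \left\{\mathbb{E}\left[M_N(U)\right]\right\}^2+\sum_{j=1}^{n}\ \sum_{\alpha,\alpha':q(\alpha,\alpha')=j}\mathbb{P}\left[L_{N,\alpha}\in\mathrm{cl}\,U,\ L_{N,\alpha'}\in\mathrm{cl}\,U\right].
\end{aligned}$$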

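We write the empirical measure in the form (3.3), and use Lemma 3.2. For any $1\le j\le n$ we have 

$$\sum_{\alpha,\alpha':q(\alpha,\alpha')=j}\mathbb{P}\left[L_{N,\alpha}\in\mathrm{cl}\,U,\ L_{N,\alpha'}\in\mathrm{cl}\,U\right] = 2^{\Gamma_j N}\,2^{(1-\Gamma_j)N}\left(2^{(1-\Gamma_j)N}-1\right)\mathbb{P}\left[L_{N,\alpha}\in\mathrm{cl}\,U,\ L_{N,\alpha'}\in\mathrm{cl}\,U\right],$$

where on the right hand side $\alpha,\alpha'$ is an arbitrary pair with $q(\alpha,\alpha')=j$. Using Lemma 3.2 we have 

$$\begin{aligned}
\mathbb{P}\left[L_{N,\alpha}\in\mathrm{cl}\,U,\ L_{N,\alpha'}\in\mathrm{cl}\,U\right]
&\le \exp\left[-N\inf_{\rho,\theta\in\mathrm{cl}\,U\,:\,\rho^{(j)}=\theta^{(j)}}\left\{H(\rho\mid\mu)+H\left(\theta\mid\theta^{(j)}\otimes\hat\mu^{(j)}\right)\right\}+\frac{N\eta}{4}\right]\\
&\le 2^{\Gamma_j N}\exp\left[-2NH(\nu\mid\mu)-\frac{N\eta}{4}\right],
\end{aligned}$$

and thus 

$$\sum_{\alpha,\alpha':q(\alpha,\alpha')=j}\mathbb{P}\left[L_{N,\alpha}\in\mathrm{cl}\,U,\ L_{N,\alpha'}\in\mathrm{cl}\,U\right]\le 2^{2N}\exp\left[-2NH(\nu\mid\mu)-\frac{N\eta}{4}\right].$$

Combining, we obtain by taking $\chi=\eta/16$ in (3.5) 

$$\mathrm{var}\left[M_N(U)\right]\le n\,2^{2N}\exp\left[-2NH(\nu\mid\mu)-\frac{N\eta}{4}\right]\le n\,e^{-N\eta/8}\left\{\mathbb{E}M_N(U)\right\}^2,$$

which proves our claim. □ 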

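Proof of Theorem 2.3. We set 

$$\mathcal{G} \overset{\mathrm{def}}{=} \left\{\nu\in\mathcal{M}_1^+(S^n) : H\left(\nu^{(j)}\mid\mu^{(j)}\right)\le\Gamma_j\log 2,\ j=1,\dots,n\right\},$$

which is a compact set. 

Step 1. We first prove the lower bound. By compactness of $\mathcal{G}$ and the semicontinuity of $H$ there exists $\nu_0\in\mathcal{G}$ such that 

$$\sup_{\nu\in\mathcal{G}}\left\{\Phi(\nu)-H(\nu\mid\mu)\right\} = \Phi(\nu_0)-H(\nu_0\mid\mu).$$

We set $\nu_\lambda = (1-\lambda)\nu_0+\lambda\mu$ for $0<\lambda<1$. By convexity of $H(\nu\mid\mu)$ in $\nu$ we see that $H\left(\nu_\lambda^{(j)}\mid\mu^{(j)}\right)<\Gamma_j\log 2$ for all $1\le j\le n$. Furthermore $\nu_\lambda\to\nu_0$ weakly as $\lambda\to 0$, and 

$$\Phi(\nu_\lambda)\to\Phi(\nu_0),\qquad H(\nu_\lambda\mid\mu)\to H(\nu_0\mid\mu).$$

Given $\varepsilon>0$ we choose $\lambda>0$ such that 

$$\Phi(\nu_\lambda)-H(\nu_\lambda\mid\mu)\ge\Phi(\nu_0)-H(\nu_0\mid\mu)-\varepsilon.$$

By the continuity of $\Phi$ and Lemma 3.5 we find a neighborhood $U$ of $\nu_\lambda$, and $\delta>0$ such that 

$$|\Phi(\theta)-\Phi(\nu_\lambda)|\le\varepsilon,\quad\theta\in U,$$

and 

$$\mathbb{P}\left[M_N(U)\le 2^N\exp\left[-NH(\nu_\lambda\mid\mu)-N\varepsilon\right]\right]\le e^{-\delta N}.$$

Then, with probability greater than $1-e^{-\delta N}$, 

$$\begin{aligned}
Z_N \overset{\mathrm{def}}{=} 2^{-N}\sum_{\alpha}\exp\left[N\Phi(L_{N,\alpha})\right]
&\ge 2^{-N}\sum_{\alpha:L_{N,\alpha}\in U}\exp\left[N\Phi(L_{N,\alpha})\right]\\
&\ge \exp\left[N\Phi(\nu_\lambda)-N\varepsilon\right]\exp\left[-NH(\nu_\lambda\mid\mu)-N\varepsilon\right]\\
&\ge \exp\left[N\sup_{\nu\in\mathcal{G}}\left\{\Phi(\nu)-H(\nu\mid\mu)\right\}-3N\varepsilon\right].
\end{aligned}$$

By Borel-Cantelli, we therefore get, as $\varepsilon$ is arbitrary, 

$$\liminf_{N\to\infty}\frac{1}{N}\log Z_N\ge\sup_{\nu\in\mathcal{G}}\left\{\Phi(\nu)-H(\nu\mid\mu)\right\}$$

almost surely. 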

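Step 2. We prove the upper bound. Let again $\varepsilon>0$ and set 

$$\overline{\mathcal{G}} \overset{\mathrm{def}}{=} \left\{\nu : H(\nu\mid\mu)\le\log 2\right\}.$$

If $\nu\in\mathcal{G}$ we choose $r_\nu>0$ such that $|\Phi(\theta)-\Phi(\nu)|\le\varepsilon$, $\theta\in B_{r_\nu}(\nu)$, and 

$$\mathbb{P}\left[M_N(B_{r_\nu}(\nu))\ge 2^N\exp\left[-NH(\nu\mid\mu)+N\varepsilon\right]\right]\le e^{-N\delta_\nu},$$

for some $\delta_\nu>0$ and large enough $N$ (using Lemma 3.3). If $\nu\in\overline{\mathcal{G}}\setminus\mathcal{G}$ we choose $r_\nu$ such that $|\Phi(\theta)-\Phi(\nu)|\le\varepsilon$, $\theta\in B_{r_\nu}(\nu)$, and 

$$\mathbb{P}\left[M_N(B_{r_\nu}(\nu))\ne 0\right]\le e^{-N\delta_\nu}, \tag{3.7}$$

again for large enough $N$ (and by Lemma 3.4). As $\overline{\mathcal{G}}$ is compact, we can cover it by a finite union of such balls, i.e. 

$$\overline{\mathcal{G}}\subset U \overset{\mathrm{def}}{=} \bigcup_{l=1}^{m}B_{r_l}(\nu_l),$$

where $r_l \overset{\mathrm{def}}{=} r_{\nu_l}$. We also set $\delta \overset{\mathrm{def}}{=} \min_l\delta_{\nu_l}$. We then estimate 

$$Z_N\le 2^{-N}\sum_{l=1}^{m}\ \sum_{\alpha:L_{N,\alpha}\in B_{r_l}(\nu_l)}\exp\left[N\Phi(L_{N,\alpha})\right]+2^{-N}\sum_{\alpha:L_{N,\alpha}\notin U}\exp\left[N\Phi(L_{N,\alpha})\right]. \tag{3.8}$$

We first claim that almost surely the second summand vanishes provided $N$ is large enough, i.e. that there is no $\alpha$ with $L_{N,\alpha}\notin U$. By Sanov's theorem, we have 

$$\limsup_{N\to\infty}\frac{1}{N}\log\mathbb{P}\left[L_{N,\alpha}\notin U\right]\le-\inf_{\nu\notin U}H(\nu\mid\mu)<-\log 2.$$

Therefore, almost surely, there is no $\alpha$ with $L_{N,\alpha}\notin U$, and therefore the second summand in (3.8) vanishes for large enough $N$, almost surely. The same applies to those summands in the first part for which $\nu_l\notin\mathcal{G}$, using (3.7). We therefore have, almost surely, for large enough $N$, 

$$\begin{aligned}
Z_N &\le 2^{-N}\sum_{l:\nu_l\in\mathcal{G}}\ \sum_{\alpha:L_{N,\alpha}\in B_{r_l}(\nu_l)}\exp\left[N\Phi(L_{N,\alpha})\right]\\
&\le e^{N\varepsilon}\,2^{-N}\sum_{l:\nu_l\in\mathcal{G}}\exp\left[N\Phi(\nu_l)\right]M_N(B_{r_l}(\nu_l))\\
&\le e^{2N\varepsilon}\sum_{l:\nu_l\in\mathcal{G}}\exp\left[N\Phi(\nu_l)\right]\exp\left[-NH(\nu_l\mid\mu)\right]\\
&\le e^{2N\varepsilon}\,m\exp\left[N\sup_{\nu\in\mathcal{G}}\left\{\Phi(\nu)-H(\nu\mid\mu)\right\}\right].
\end{aligned}$$

As $\varepsilon$ is arbitrary, we get 

$$\limsup_{N\to\infty}\frac{1}{N}\log Z_N\le\sup_{\nu\in\mathcal{G}}\left[\Phi(\nu)-H(\nu\mid\mu)\right].$$

This finishes the proof of Theorem 2.3. □ 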



3.2. The dual representation. Proof of Theorem 2.5. We define a family $\mathcal{G}(\phi) = \{G_{\phi,m}\}$ of probability distributions on $S^n$ which depend on the parameter $m = (m_1,\dots,m_n)\in\Delta$. The probability measure $G = G_{\phi,m}$ is described by a "starting" measure $\gamma$ on $S$, and for $2\le j\le n$ Markov kernels $K_j$ from $S^{j-1}$ to $S$, so that $G$ is the semi-direct product 

$$G = \gamma\otimes K_2\otimes\cdots\otimes K_n,$$

$$\gamma(dx) \overset{\mathrm{def}}{=} \frac{\exp\left[m_1\phi_1(x)\right]\mu_1(dx)}{\exp\left[m_1\phi_0\right]},$$

$$K_j\left(x^{(j-1)},dx_j\right) \overset{\mathrm{def}}{=} \frac{\exp\left[m_j\phi_j\left(x^{(j)}\right)\right]\mu_j(dx_j)}{\exp\left[m_j\phi_{j-1}\left(x^{(j-1)}\right)\right]},$$

where we write $x^{(j)} \overset{\mathrm{def}}{=} (x_1,\dots,x_j)$. Remember the definition of the functions $\phi_j : S^j\to\mathbb{R}$ in (2.4), (2.5). It should be remarked that these objects are defined for all $m\in\mathbb{R}^n$, and not just for $m\in\Delta$. We also write 

$$G^{(j)} \overset{\mathrm{def}}{=} \gamma\otimes K_2\otimes\cdots\otimes K_j,$$

which is the marginal of $G$ on $S^j$. In order to emphasize the dependence on $m$, we occasionally will write $\phi_{j,m}$, $\gamma_m$, $K_{j,m}$ etc. 

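We remark that by a simple computation 

$$\int H\left(K_j\left(x^{(j-1)},\cdot\right)\mid\mu_j\right)G^{(j-1)}\left(dx^{(j-1)}\right) = m_j\left[\int\phi_j\,dG^{(j)}-\int\phi_{j-1}\,dG^{(j-1)}\right]. \tag{3.9}$$

The functions $\phi_j,\dots,\phi_n$ do not depend on $m_j$, but $\phi_0,\dots,\phi_{j-1}$ do. Differentiating the equation 

$$e^{m_{r+1}\phi_r\left(x^{(r)}\right)} = \int e^{m_{r+1}\phi_{r+1}\left(x^{(r)},x_{r+1}\right)}\mu_{r+1}(dx_{r+1})$$

with respect to $m_j$, we get for $0\le r\le j-2$ 

$$\frac{\partial\phi_r\left(x^{(r)}\right)}{\partial m_j} = \int\frac{\partial\phi_{r+1}\left(x^{(r)},x_{r+1}\right)}{\partial m_j}\,K_{r+1}\left(x^{(r)},dx_{r+1}\right), \tag{3.10}$$

and for $r = j-1$ 

$$\phi_{j-1}\left(x^{(j-1)}\right)+m_j\frac{\partial\phi_{j-1}\left(x^{(j-1)}\right)}{\partial m_j} = \int\phi_j\left(x^{(j)}\right)K_j\left(x^{(j-1)},dx_j\right),$$

i.e. 

$$\frac{\partial\phi_{j-1}\left(x^{(j-1)}\right)}{\partial m_j} = \frac{1}{m_j}\left[\int\phi_j\left(x^{(j)}\right)K_j\left(x^{(j-1)},dx_j\right)-\phi_{j-1}\left(x^{(j-1)}\right)\right].$$

Combining that with (3.9), (3.10) we get 

$$\frac{\partial\phi_{0,m}}{\partial m_j} = \frac{1}{m_j^2}\int H\left(K_j\left(x^{(j-1)},\cdot\right)\mid\mu_j\right)G^{(j-1)}\left(dx^{(j-1)}\right). \tag{3.11}$$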
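The derivative formula (3.11) can be sanity-checked by finite differences on a small example. The sketch below assumes $n=2$ with finite spaces of three and two points; the measures `mu1`, `mu2`, the table `phi` and the evaluation point $(m_1,m_2)=(0.5,0.8)$ are arbitrary illustrative choices. For $j=1$ the right-hand side of (3.11) reduces to $H(\gamma\mid\mu_1)/m_1^2$.

```python
import math

mu1 = [0.2, 0.5, 0.3]                          # assumed law of the first level
mu2 = [0.6, 0.4]                               # assumed law of the second level
phi = [[0.0, 1.0], [2.0, -1.0], [0.5, 1.5]]    # assumed values phi(x1, x2)

def phi1(m2, x1):
    """phi_1(x1) from (2.5) with j = 2."""
    return math.log(sum(math.exp(m2 * phi[x1][x2]) * mu2[x2]
                        for x2 in range(2))) / m2

def phi0(m1, m2):
    """phi_0 from (2.5) with j = 1."""
    return math.log(sum(math.exp(m1 * phi1(m2, x1)) * mu1[x1]
                        for x1 in range(3))) / m1

def rel_entropy(nu, mu):
    return sum(q * math.log(q / p) for q, p in zip(nu, mu) if q > 0)

def rhs(j, m1, m2):
    """Right-hand side of (3.11) for j = 1 or j = 2."""
    p0 = phi0(m1, m2)
    gamma = [math.exp(m1 * (phi1(m2, x1) - p0)) * mu1[x1] for x1 in range(3)]
    if j == 1:
        return rel_entropy(gamma, mu1) / m1 ** 2
    total = 0.0
    for x1 in range(3):
        k2 = [math.exp(m2 * (phi[x1][x2] - phi1(m2, x1))) * mu2[x2]
              for x2 in range(2)]              # the kernel K_2(x1, .)
        total += gamma[x1] * rel_entropy(k2, mu2)
    return total / m2 ** 2

def finite_diff(j, m1, m2, h=1e-6):
    """Central difference approximation of d(phi_0)/d(m_j)."""
    if j == 1:
        return (phi0(m1 + h, m2) - phi0(m1 - h, m2)) / (2 * h)
    return (phi0(m1, m2 + h) - phi0(m1, m2 - h)) / (2 * h)
```

Both partial derivatives are nonnegative, in line with the fact that the right-hand side of (3.11) is an averaged relative entropy.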



Theorem 2.5 is immediate from the following result: 

Proposition 3.6. Assume that $\phi : S^n\to\mathbb{R}$ is bounded and continuous. Then there is a unique measure $\nu$ maximizing $\mathrm{Gibbs}(\phi,\nu)$ under the constraint $\nu\in\bigcap_{j=1}^{n}\mathcal{R}_j$. This measure is of the form $\nu = G_{\phi,m}$ where $m$ is the unique element in $\Delta$ minimizing (2.7). For this $m$, we have 

$$\mathrm{Gibbs}(\phi,G) = \mathrm{Parisi}(m,\phi). \tag{3.12}$$

Proof. From strict convexity of the relative entropy, and the fact that $\bigcap_{j=1}^{n}\mathcal{R}_j$ is compact and convex, it follows that there is a unique maximizer $\nu$ of $\mathrm{Gibbs}(\phi,\nu)$ under this constraint. 

Also, a straightforward application of Hölder's inequality shows that $\mathrm{Parisi}(m,\phi)$ is a strictly convex function in the variables $1/m_j$. Therefore, it follows that there is a uniquely attained minimum of $\mathrm{Parisi}(m,\phi)$ as a function of $m\in\Delta$. This minimizing $m = (m_1,\dots,m_n)$ can be split into subblocks of equal values: There is a number $K$, $0\le K\le n$, and indices $0<j_1<j_2<\cdots<j_K\le n$ such that 

$$0<m_1=\cdots=m_{j_1}<m_{j_1+1}=\cdots=m_{j_2}<m_{j_2+1}=\cdots<m_{j_{K-1}+1}=\cdots=m_{j_K}<m_{j_K+1}=\cdots=m_n=1.$$

$K=0$ just means that all $m_i = 1$. If $j_K = n$, then all $m_i$ are $<1$. We write $G = G_{\phi,m}$. 
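From (3.11), we immediately have 

$$\frac{\partial\,\mathrm{Parisi}(m,\phi)}{\partial m_j} = \frac{1}{m_j^2}\left[\int H\left(K_j\left(x^{(j-1)},\cdot\right)\mid\mu_j\right)G^{(j-1)}\left(dx^{(j-1)}\right)-\gamma_j\log 2\right]. \tag{3.13}$$

Set $d_j \overset{\mathrm{def}}{=} \int H\left(K_{j,m}\left(x^{(j-1)},\cdot\right)\mid\mu_j\right)G_m^{(j-1)}\left(dx^{(j-1)}\right)$. We use (3.13) and the minimality of $\mathrm{Parisi}(m,\phi)$ at $m$. We can perturb $m$ by moving a whole block $m_{j_r+1}=\cdots=m_{j_{r+1}}$ up and down locally, without leaving $\Delta$, provided it is not the possibly present block of values 1. This leads to 

$$\sum_{i=j_r+1}^{j_{r+1}}d_i = \log 2\sum_{i=j_r+1}^{j_{r+1}}\gamma_i.$$

Furthermore, we can always move first parts of blocks, say $m_{j_r+1},\dots,m_k$, $k<j_{r+1}$, locally down, without leaving $\Delta$, so that we get 

$$\sum_{i=j_r+1}^{k}d_i\le\log 2\sum_{i=j_r+1}^{k}\gamma_i.$$

These two observations imply 

$$G\in\bigcap_{j=1}^{n}\mathcal{R}_j\cap\bigcap_{r=1}^{K}\mathcal{R}^{=}_{j_r}. \tag{3.14}$$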

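We next prove 

$$\mathrm{Gibbs}(\phi,\nu)\le\mathrm{Gibbs}(\phi,G) \tag{3.15}$$

for any $\nu\in\bigcap_{j=1}^{n}\mathcal{R}_j$. 

We first prove the case $n=1$. If $m<1$, then 

$$H(G\mid\mu) = \log 2\ge H(\nu\mid\mu)$$

by (3.14) and the assumption $\nu\in\mathcal{R}_1$. Therefore, in any case 

$$\mathrm{Gibbs}(\phi,G)-\mathrm{Gibbs}(\phi,\nu)\ge\int\phi\,dG-\frac{1}{m}H(G\mid\mu)-\int\phi\,d\nu+\frac{1}{m}H(\nu\mid\mu) = \frac{1}{m}H(\nu\mid G)\ge 0.$$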

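The general case follows by a slight extension of the above argument. Put, with the convention $m_{n+1}=1$, 

$$D_k \overset{\mathrm{def}}{=} \int\phi_k\,dG^{(k)}-\frac{1}{m_{k+1}}H\left(G^{(k)}\mid\mu^{(k)}\right)-\int\phi_k\,d\nu^{(k)}+\frac{1}{m_{k+1}}H\left(\nu^{(k)}\mid\mu^{(k)}\right).$$

$D_0 = 0$, $D_n = \mathrm{Gibbs}(\phi,G)-\mathrm{Gibbs}(\phi,\nu)$. We prove $D_{k-1}\le D_k$ for all $k$, so that the claim follows. Remark that as above in the $n=1$ case, if $m_k<m_{k+1}$, then $H\left(G^{(k)}\mid\mu^{(k)}\right) = \Gamma_k\log 2$, and therefore, in any case 

$$\begin{aligned}
D_k &\ge \int\phi_k\,dG^{(k)}-\frac{1}{m_k}H\left(G^{(k)}\mid\mu^{(k)}\right)-\int\phi_k\,d\nu^{(k)}+\frac{1}{m_k}H\left(\nu^{(k)}\mid\mu^{(k)}\right)\\
&= \int\phi_{k-1}\,dG^{(k-1)}-\frac{1}{m_k}H\left(G^{(k-1)}\mid\mu^{(k-1)}\right)-\int\phi_k\,d\nu^{(k)}+\frac{1}{m_k}H\left(\nu^{(k)}\mid\mu^{(k)}\right),
\end{aligned}$$

where the equality uses (3.9) and (3.2). As 

$$H\left(\nu^{(k)}\mid\mu^{(k)}\right)-m_k\int\phi_k\,d\nu^{(k)}+m_k\int\phi_{k-1}\,d\nu^{(k-1)} = H\left(\nu^{(k-1)}\mid\mu^{(k-1)}\right)+\int\log\frac{\nu\left(dx_k\mid x^{(k-1)}\right)}{K_k\left(x^{(k-1)},dx_k\right)}\,\nu^{(k)}\left(dx^{(k)}\right),$$

and the last integral is nonnegative, we get $D_k\ge D_{k-1}$, and (3.15) is proved. 

(3.14) and (3.15) identify $G = G_{\phi,m}$ as the unique maximizer of $\mathrm{Gibbs}(\phi,\cdot)$ under the constraint $\nu\in\bigcap_{j=1}^{n}\mathcal{R}_j$. 

The identification (3.12) comes by a straightforward computation. □ 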